Synthetic data generated by generative models can enhance the performance and capabilities of data-hungry deep learning models in medical imaging. However, (1) the availability of (synthetic) datasets is limited and (2) generative model training is complex, which hinders their adoption in research and clinical applications. To reduce this entry barrier, we propose medigan, a one-stop shop for pretrained generative models implemented as an open-source, framework-agnostic Python library. medigan allows researchers and developers to create, increase, and domain-adapt their training data in just a few lines of code. Guided by design decisions based on gathered end-user requirements, we implement medigan around modular components for generative model (i) execution, (ii) visualisation, (iii) search and ranking, and (iv) contribution. The library's scalability and design are demonstrated by its growing number of integrated and readily usable pretrained generative models, consisting of 21 models utilising 9 different Generative Adversarial Network architectures trained on 11 datasets from 4 domains, namely mammography, endoscopy, x-ray, and MRI. Furthermore, 3 applications of medigan are analysed in this work, which include (a) enabling community-wide sharing of restricted data, (b) investigating generative model evaluation metrics, and (c) improving clinical downstream tasks. In (b), extending common medical image synthesis assessment and reporting standards, we show Fréchet Inception Distance variability based on image normalisation and radiology-specific feature extraction.
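To make the few-lines-of-code workflow concrete, here is a minimal sketch of pulling synthetic samples from one of the pretrained models through the library's Generators interface; the model identifier and keyword arguments follow the medigan quick-start documentation as best recalled and should be verified against the current release.

```python
# Minimal sketch: generate synthetic mammography patches with medigan.
# The model_id below is illustrative; browse the model collection for others.
from medigan import Generators

generators = Generators()

# Generate a small batch of synthetic images with a selected pretrained model.
generators.generate(
    model_id="00001_DCGAN_MMG_CALC_ROI",  # illustrative pretrained model ID
    num_samples=8,
)
```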
Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths. We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered cortical hierarchies. This is achieved by exploiting the noise naturally found in biophysical systems as an additional carrier of information. In our dynamical system, all weights are learned simultaneously with always-on plasticity and using only information locally available to the synapses. Our method is completely phase-free (no forward and backward passes or phased learning) and allows for efficient error propagation across multi-layer cortical hierarchies, while maintaining biologically plausible signal transport and learning. Our method is applicable to a wide class of models and improves on previously known biologically plausible ways of credit assignment: compared to random synaptic feedback, it can solve complex tasks with fewer neurons and learn more useful latent representations. We demonstrate this on various classification tasks using a cortical microcircuit model with prospective coding.
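For context on the weight-transport problem PAL addresses, the NumPy sketch below contrasts the backpropagation error signal, which reuses the transpose of the forward weights, with random synaptic feedback (feedback alignment), where a fixed random matrix carries the error instead. This illustrates the baseline setting the abstract compares against, not the PAL mechanism itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer: x -> h = tanh(W1 x) -> y = W2 h
n_in, n_hid, n_out = 4, 8, 2
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))
B2 = rng.normal(scale=0.5, size=(n_hid, n_out))  # fixed random feedback weights

x = rng.normal(size=n_in)
target = np.array([1.0, 0.0])

h = np.tanh(W1 @ x)
y = W2 @ h
err = y - target  # output error

# Backpropagation: the hidden error reuses W2^T (weight transport).
delta_h_bp = (W2.T @ err) * (1 - h**2)

# Feedback alignment: a fixed random matrix B2 replaces W2^T.
delta_h_fa = (B2 @ err) * (1 - h**2)

# Local weight updates driven by the (random-feedback) error signal.
lr = 0.1
W1 -= lr * np.outer(delta_h_fa, x)
W2 -= lr * np.outer(err, h)
```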
This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (a neurorehabilitation hospital), showing promising results. Our work could have a great impact on remote healthcare applications, improving medical efficiency and reducing healthcare costs. Future steps include further clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases.
The Predicting Media Memorability task in the MediaEval evaluation campaign has been running annually since 2018 and several different tasks and data sets have been used in this time. This has allowed us to compare the performance of many memorability prediction techniques on the same data and in a reproducible way and to refine and improve on those techniques. The resources created to compute media memorability are now being used by researchers well beyond the actual evaluation campaign. In this paper we present a summary of the task, including the collective lessons we have learned for the research community.
This paper presents an automatic approach to creating taxonomies of technical terms based on the Cooperative Patent Classification (CPC). The resulting taxonomy contains about 170k nodes in 9 separate technological branches and is freely available. We also show that a Text-to-Text Transfer Transformer (T5) model can be fine-tuned to generate hypernyms and hyponyms with relatively high precision, confirming the manually assessed quality of the resource. The T5 model opens the taxonomy to any new technological terms for which a hypernym can be generated, thus making the resource updateable with new terms, an essential feature for the constantly evolving field of technological terminology.
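To make the fine-tuning setup concrete, the sketch below shows hypernym generation with a T5 model via the Hugging Face transformers API; the checkpoint name and the prompt format are illustrative assumptions, since the paper's exact input/output templates are not reproduced in this abstract.

```python
# Hypothetical inference sketch: the task prefix and checkpoint are
# placeholders, not the paper's actual fine-tuned model.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-base"  # placeholder; a CPC-fine-tuned checkpoint would be used
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

term = "lithium-ion battery"
prompt = f"generate hypernym: {term}"  # assumed task prefix

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```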
In this work, we propose a framework relying solely on chat-based customer support (CS) interactions for predicting the recommendation decision of individual users. For our case study, we analyzed a total of 16.4k users and 48.7k customer support conversations within the financial vertical of a large e-commerce company in Latin America. Our main contribution is to use Natural Language Processing (NLP) to assess and predict recommendation behavior where, in addition to static sentiment analysis, we exploit the predictive power of each user's sentiment dynamics. Our results show that, while retaining feature interpretability, it is possible to predict the likelihood that a user will recommend a product or service based solely on the message-wise sentiment evolution of their CS conversations, in a fully automated way.
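The sketch below illustrates one plausible way to turn a per-message sentiment trajectory into dynamics features for a recommendation classifier; the feature set and the classifier choice are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def dynamics_features(sentiments):
    """Summarise a per-message sentiment trajectory (values in [-1, 1])."""
    s = np.asarray(sentiments, dtype=float)
    t = np.arange(len(s))
    slope = np.polyfit(t, s, 1)[0] if len(s) > 1 else 0.0  # trend over the conversation
    return np.array([s.mean(), s.std(), s[-1] - s[0], slope])

# Toy data: each conversation is a sequence of message-level sentiment scores,
# labelled by whether the user later recommended the product (1) or not (0).
conversations = [[-0.2, 0.1, 0.6, 0.8], [0.3, -0.1, -0.5, -0.7], [0.0, 0.2, 0.4, 0.5]]
labels = [1, 0, 1]

X = np.stack([dynamics_features(c) for c in conversations])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```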
In this paper we present a novel multi-attribute face manipulation method based on textual descriptions. Previous text-based image editing methods either require test-time optimization for each individual image or are restricted to single-attribute editing. Extending these methods to multi-attribute face image editing scenarios introduces undesired excessive attribute change: text-relevant attributes are overly manipulated and text-irrelevant attributes are also changed. To address these challenges and achieve natural editing over multiple face attributes, we propose a new decoupling training scheme in which we use group sampling to get text segments from the same attribute categories, instead of whole complex sentences. Further, to preserve other existing face attributes, we encourage the model to edit the latent code of each attribute separately via an entropy constraint. During the inference phase, our model is able to edit new face images without any test-time optimization, even from complex textual prompts. We show extensive experiments and analysis to demonstrate the efficacy of our method, which generates natural-looking manipulated faces with minimal text-irrelevant attribute editing. Code and the pre-trained model will be released.
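One plausible form of such an entropy constraint (an illustrative assumption; the abstract does not give the exact loss) is to normalise the per-attribute latent edit magnitudes into a distribution and penalise its entropy: with latent offsets $\delta_k$ for attribute $k$, let $p_k = \|\delta_k\|_1 / \sum_j \|\delta_j\|_1$ and minimise $\mathcal{L}_{\mathrm{ent}} = -\sum_k p_k \log p_k$, so that each text segment concentrates its edit on its own attribute and leaves text-irrelevant attributes largely unchanged.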
Recent neuroimaging studies that focus on predicting brain disorders via modern machine learning approaches commonly include a single modality and rely on supervised over-parameterized models. However, a single modality provides only a limited view of the highly complex brain. Critically, supervised models in clinical settings lack accurate diagnostic labels for training; coarse labels do not capture the long-tailed spectrum of brain disorder phenotypes, which leads to a loss of generalizability of the model and makes it less useful in diagnostic settings. This work presents a novel multi-scale coordinated framework for learning multiple representations from multimodal neuroimaging data. We propose a general taxonomy of informative inductive biases to capture unique and joint information in multimodal self-supervised fusion. The taxonomy forms a family of decoder-free models with reduced computational complexity that capture multi-scale relationships between local and global representations of the multimodal inputs. We provide a comprehensive evaluation of the taxonomy using functional and structural magnetic resonance imaging (MRI) data across a spectrum of Alzheimer's disease phenotypes and show that the self-supervised models reveal disorder-relevant brain regions and multimodal links without access to labels during pre-training. The proposed multimodal self-supervised learning yields representations with improved classification performance for both modalities. The accompanying rich and flexible unsupervised deep learning framework captures complex multimodal relationships and provides predictive performance that meets or exceeds that of a narrower supervised classification analysis. We provide thorough quantitative evidence of how this framework can significantly advance the search for missing links in complex brain disorders.
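As a rough illustration of coordinating local and global representations across two imaging modalities without a decoder (an assumption about the general flavour of such frameworks, not the paper's actual objective), a contrastive pairing might look like the following sketch.

```python
import torch
import torch.nn.functional as F

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE between two batches of embeddings of shape (batch, dim)."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Hypothetical encoder outputs: one per modality, each exposing a local
# (region-level) and a global (subject-level) representation, no decoder.
batch, dim, n_regions = 16, 128, 32
g_fmri = torch.randn(batch, dim)             # global fMRI embedding
g_smri = torch.randn(batch, dim)             # global sMRI embedding
l_fmri = torch.randn(batch, n_regions, dim)  # local fMRI embeddings

# Cross-modal global-to-global term plus a local-to-global term; a full
# taxonomy would enumerate further such pairings.
loss = info_nce(g_fmri, g_smri) + info_nce(l_fmri.mean(dim=1), g_smri)
print(float(loss))
```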
In the (special) smoothing spline problem one considers a variational problem with a quadratic data fidelity penalty and Laplacian regularisation. Higher-order regularity can be obtained by replacing the Laplacian regulariser with a poly-Laplacian regulariser. The methodology is readily adapted to graphs, and here we consider graph poly-Laplacian regularisation in a fully supervised, non-parametric, noise-corrupted regression problem. In particular, given a dataset $\{x_i\}_{i=1}^n$ and a set of noisy labels $\{y_i\}_{i=1}^n \subset \mathbb{R}$, we let $u_n : \{x_i\}_{i=1}^n \to \mathbb{R}$ be the minimiser of an energy consisting of a data fidelity term and an appropriately scaled graph poly-Laplacian term. When $y_i = g(x_i) + \xi_i$, for iid noise $\xi_i$, and using a geometric random graph, we identify (with high probability) the rate of convergence of $u_n$ to $g$ in the large data limit $n \to \infty$. Furthermore, our rate coincides, up to logarithms, with the known rate of convergence in the usual smoothing spline model.
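For concreteness, the minimised energy takes the following general form (a sketch, assuming an $s$-th order graph poly-Laplacian $\Delta_n^s$ and a regularisation weight $\tau_n$; the precise scaling is as specified in the paper): $$u_n \in \operatorname*{arg\,min}_{u:\{x_i\}_{i=1}^n \to \mathbb{R}} \; \frac{1}{n}\sum_{i=1}^{n} \bigl(u(x_i) - y_i\bigr)^2 \;+\; \tau_n \,\langle u, \Delta_n^{s} u \rangle .$$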
In-vehicle systems with on-board sensors are becoming increasingly connected. This enables information sharing towards a more comprehensive understanding of the environment. However, peer-to-peer communication over public cellular networks brings multiple networking obstacles to tackle, requiring networked systems that relay communication and connect parties which cannot be connected directly. Web Real-Time Communication (WebRTC) is a good candidate for streaming media across vehicles, since it enables low-latency communication while bringing standard protocols for secure handshakes, discovery of public IPs, and traversal of Network Address Translation (NAT) systems. However, end-to-end Quality of Service (QoS) adaptation in an infrastructure where sending and receiving are decoupled through a relay requires a mechanism to efficiently adapt the video stream to the network capacity. To this end, this paper investigates mechanisms that address resolution, frame rate, and bitrate changes by leveraging Real-time Transport Control Protocol (RTCP) metrics such as bandwidth and round-trip time. The solution aims to ensure that the receiving system obtains the relevant information in a timely manner. The impact on end-to-end throughput efficiency and reaction time when applying the different adaptation approaches is analysed in a real 5G testbed.
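As a rough sketch of the kind of adaptation logic investigated here (the thresholds and the policy below are purely illustrative assumptions, not the paper's mechanism), a relay-side controller could map RTCP-style measurements onto encoder settings as follows.

```python
from dataclasses import dataclass

@dataclass
class EncoderSettings:
    bitrate_kbps: int
    framerate: int
    resolution_scale: float  # 1.0 = full resolution

def adapt(available_kbps: float, rtt_ms: float) -> EncoderSettings:
    """Illustrative heuristic: track ~85% of the estimated bandwidth and back
    off spatial/temporal quality when latency grows."""
    target_kbps = int(0.85 * available_kbps)
    if rtt_ms > 200:          # congestion suspected: reduce frame rate first
        return EncoderSettings(target_kbps, framerate=15, resolution_scale=0.5)
    if target_kbps < 1000:    # low bandwidth: drop resolution, keep motion smooth
        return EncoderSettings(target_kbps, framerate=30, resolution_scale=0.5)
    return EncoderSettings(target_kbps, framerate=30, resolution_scale=1.0)

print(adapt(available_kbps=2500, rtt_ms=60))
```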